Abstract — A joint order detection and blind estimation algorithm
for single-input multiple-output channels is proposed. By
exploiting the isomorphic relation between the channel input and
output subspaces, it is shown that the channel order and channel
impulse response are uniquely determined by the finite least squares
smoothing error sequence in the absence of noise. In simulations,
the proposed subspace algorithm shows marked improvement over
existing algorithms in performance and robustness.

Index  Terms —  Blind  channel  identification,  least  squares 
method. 


I.  Introduction 

ONE OF THE most important requirements for blind
channel estimation and equalization is the speed of convergence.
This is especially the case when it is used in packet
transmission  systems  where  only  a  small  number  of  data 
samples  are  available  for  processing.  Among  blind  channel 
estimation  techniques  developed  recently  [12],  those  based 
on  the  so-called  deterministic  models  have  a  clear  advantage 
in  the  speed  of  convergence.  Without  assuming  a  specific 
stochastic  model  of  the  input  sequence,  these  “deterministic” 
techniques  are  capable  of  obtaining  perfect  channel  estimation 
within  a  finite  number  of  samples  in  the  absence  of  noise. 
Such  a  finite-sample  convergence  property  comes  mainly  from 
the  exploitation  of  the  multichannel  structure  first  used  in 
[13].  Existing  algorithms  with  this  attractive  feature  include 
the  subspace  (SS)  algorithm  [7],  the  cross  relation  (CR)  (also 
referred  to  as  the  least  squares)  algorithm  [16],  the  EVAM  [5], 
the  two-step  maximum  likelihood  (TSML)  approach  [6],  and 
the linear prediction-subspace (LP-SS) algorithm proposed by
Slock [9].

Existing  algorithms  with  the  finite-sample  convergence 
property  share  a  common  difficulty:  the  determination  of 
channel  order.  While  many  order  detection  algorithms  can  be 
applied  (see,  e.g.,  [15]  and  references  therein),  the  approach 
of  separate  order  detection  and  channel  estimation  may  not 
be  effective,  especially  when  the  channel  impulse  response 
has small head and tail taps. Addressing this issue, a class of
channel estimation algorithms based on the linear prediction
(LP) interpretation of multichannel moving-average processes
has been proposed, first by Slock [9], followed by Slock and
Papadias [10], Abed-Meraim et al. [1], [3], and, more recently,
Gesbert and Duhamel [4]. Although only the upper bound
of  the  channel  order  is  required,  these  algorithms,  with  the 
exception  of  the  (LP-SS)  approach  [9],  which  still  requires 
the  knowledge  of  the  channel  order,  suffer  considerable 
performance  loss  due  to  the  requirement  that  the  input 
sequence  is  white.  Consequently,  these  algorithms  need  a 
relatively  large  sample  size  for  accurate  channel  estimation, 
which  causes  the  loss  of  finite-sample  convergence  property. 
Further,  we  may  argue  that  although  these  algorithms  provide 
a  consistent  estimate  with  only  the  knowledge  of  the  bound 
of  the  channel  order,  overdetermination  of  the  channel  order 
does  affect  the  performance  when  the  sample  size  is  finite. 

The contribution of this paper is twofold. First, by exploiting
the isomorphic relation between the input and output
subspaces,  we  introduce  a  geometrical  approach  to  linear 
least  squares  smoothing  channel  estimation  that  preserves  the 
finite  sample  convergence  property.  This  geometrical  approach 
provides  a  simple  and  unified  derivation  of  different  LP-based 
channel  estimators.  Second,  we  develop  a  joint  order  detection 
and  channel  estimation  algorithm  that  aims  to  minimize  the 
smoothing  error  by  jointly  choosing  the  channel  order  and 
coefficients.  When  compared  with  existing  approaches,  the 
proposed  algorithm  provides  considerable  improvement  in 
convergence  over  LP-based  approaches.  There  is  also  marked 
improvement  over  CR  and  SS  algorithms  in  robustness  against 
the  loss  of  channel  diversity. 

This  paper  is  organized  as  follows.  Section  II  presents  a  list 
of  key  notations  followed  by  the  channel  model.  Geometrical 
properties  of  least  squares  smoothing  based  on  the  isomorphic 
relation  between  the  output  and  input  subspaces  are  presented 
in Section III. In Section IV, we present a general formulation
of least squares smoothing (LSS), the data structures used in
algorithm development and their properties, and a joint order
detection and channel estimation algorithm. Simulation results
are presented in Section V, where we compare the proposed
algorithm with existing techniques. In conclusion, we comment
on the strengths and weaknesses of the proposed approach.

II.  The  Model  and  Preliminaries 
A.  Notations 

Notations used in this paper are mostly standard. Signals
are discrete-time and complex in general. We use $x(z)$ to
denote the $z$-transform of signal $x_t$, and $x_t * y_t$ stands for
the convolution of $x_t$ and $y_t$. Upper- and lower-case bold
letters denote matrices and vectors, respectively. $(\cdot)^t$ and $(\cdot)'$
are transpose and Hermitian operations. Matrix $0_{m \times n}$ stands
for the $m \times n$ zero matrix. Given a matrix $A$, $\mathcal{R}\{A\}$ ($\mathcal{C}\{A\}$) is
the row (column) space of matrix $A$. For a matrix $X$ having
the same number of columns as $A$, $\mathcal{P}_A\{X\}$ ($\mathcal{P}_A^{\perp}\{X\}$) is
the projection of $X$ onto (the orthogonal complement of) the row
space of $A$. For a set of vectors $x_1, \cdots, x_n$, $\mathrm{sp}\{x_1, \cdots, x_n\}$
denotes the linear subspace spanned by $x_1, \cdots, x_n$. $\|\cdot\|$
denotes the 2-norm. Finally, $V_n(x_1, \cdots, x_m)$ denotes the $n \times m$
Vandermonde matrix specified by $\{x_1, \cdots, x_m\}$

$$V_n(x_1, \cdots, x_m) = \begin{pmatrix} 1 & \cdots & 1 \\ x_1 & \cdots & x_m \\ \vdots & & \vdots \\ x_1^{n-1} & \cdots & x_m^{n-1} \end{pmatrix}.$$

Fig. 1. Single-input multiple-output linear system.

B.  The  System  Model 

The identification and estimation of a single-input $P$-output
linear system (channel), shown in Fig. 1, using only the
observation data is considered in this paper. The system is
described by

$$x(t) = \sum_{i=0}^{L} h_i s_{t-i}, \qquad y(t) = x(t) + n(t), \quad t = 1, 2, \cdots, N \tag{2}$$

where $x(t) = [x_t^{(1)}, \cdots, x_t^{(P)}]^t$ is the (noiseless) channel
output, $n(t)$ is the additive noise, $y(t)$ is the received signal,
$\{h_t = [h_t^{(1)}, \cdots, h_t^{(P)}]^t\}$ is the channel impulse response, and
$s_t$ is the input sequence. Consider a block of $w$ samples of the
observation in (2), and let $y_w(t) = [y^t(t), \cdots, y^t(t-w+1)]^t$.
With $x_w(t)$, $s_{L+w}(t)$, and $n_w(t)$ similarly defined, we have

$$x_w(t) = \mathcal{F}_w(h)\, s_{L+w}(t), \qquad y_w(t) = x_w(t) + n_w(t) \tag{3}$$

where the $wP \times (w+L)$ complex matrix $\mathcal{F}_w(h)$ is the so-called
filtering matrix

$$\mathcal{F}_w(h) = \begin{pmatrix} h_0 & \cdots & h_L & & \\ & \ddots & & \ddots & \\ & & h_0 & \cdots & h_L \end{pmatrix}. \tag{4}$$

Our goal is to estimate $h = [h_L', \cdots, h_0']'$ from $y_w(t)$, $t = 1, \cdots, N$.
All signals are deterministic, although most results
can be generalized to statistical models of the input and noise.
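
As a numerical sanity check of (3) and (4), the following sketch builds the filtering matrix for a randomly drawn channel and confirms that a stack of $w$ noiseless outputs equals the filtering matrix times the stacked inputs. The function name `filtering_matrix`, the array layout (row `i` of `h` holds $h_i$), and the values chosen for $P$, $L$, $w$, and $N$ are assumptions made only for illustration.

```python
import numpy as np

def filtering_matrix(h, w):
    """Block Toeplitz filtering matrix of size wP x (w + L).

    h has shape (L + 1, P); row i holds the P-vector h_i.
    """
    L1, P = h.shape
    T = np.zeros((w * P, w + L1 - 1), dtype=complex)
    for r in range(w):
        # Block row r carries h_0, ..., h_L starting at column r.
        T[r * P:(r + 1) * P, r:r + L1] = h.T
    return T

rng = np.random.default_rng(0)
P, L, w, N = 2, 3, 4, 50
h = rng.standard_normal((L + 1, P)) + 1j * rng.standard_normal((L + 1, P))
s = rng.choice(np.array([1, -1, 1j, -1j]), size=N)            # QPSK input
# x[t] = sum_i h_i s[t - i]; exact for t >= L (zero-based indices).
x = np.stack([np.convolve(s, h[:, p])[:N] for p in range(P)], axis=1)

t = 20
x_w = x[t - w + 1:t + 1][::-1].reshape(-1)      # [x(t); ...; x(t - w + 1)]
s_wL = s[t - w - L + 1:t + 1][::-1]             # [s_t, ..., s_{t - w - L + 1}]
residual = np.linalg.norm(filtering_matrix(h, w) @ s_wL - x_w)
```

The residual is zero up to rounding, which is the content of the first relation in (3).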


The  following  two  assumptions  (one  on  the  system,  the  other 
on  the  input  sequence)  are  made  throughout  the  paper. 

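A1) There exists a (smallest) $w_0$ such that the filtering
matrix $\mathcal{F}_{w_0}(h)$ has full column rank.

A2) The input sequence $s_t$ has linear complexity [2] greater
than $L^* = 2w_0 + 2L$, i.e.,

$$\operatorname{rank}\begin{pmatrix} s_{L^*-L+1} & \cdots & s_N \\ \vdots & \text{Toeplitz} & \vdots \\ s_{1-L} & \cdots & s_{N-L^*} \end{pmatrix} = L^* + 1. \tag{5}$$
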

Assumption A1), which was first exploited in [13], is necessary
for all methods based on (general) deterministic modeling
of the input sequence. Specifically, if A1) is not satisfied, there
exists a different $\{\tilde{h}_t, \tilde{s}_t\}$ such that $h_t * s_t = \tilde{h}_t * \tilde{s}_t$, i.e., two
channels and their inputs produce the same noiseless observation
and are unidentifiable from the observation. Implications
of A1) are summarized below.

Property 1: Under A1), we have the following:

P1.1) The subchannel transfer functions do not share common
zeros, i.e., $\{h^{(i)}(z)\}$ are co-prime.

P1.2) $\mathcal{F}_w(h)$ has full column rank for all $w \ge w_0$.

P1.3) If $P = 2$, then $w_0 = L$. In general, $w_0 \le L$.
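
Property 1 can be checked numerically for $P = 2$. The sketch below is only an illustration under assumed values ($L = 3$, randomly drawn coefficients, a common zero placed at $z = 2$, and a rank tolerance of $10^{-8}$); `filtering_matrix` and `has_full_column_rank` are hypothetical helper names. It shows that a generic channel reaches full column rank at $w = L$ but not at $w = L - 1$, and that a channel whose subchannels share a zero never does.

```python
import numpy as np

def filtering_matrix(h, w):
    """wP x (w + L) block Toeplitz matrix; row i of h holds h_i."""
    L1, P = h.shape
    T = np.zeros((w * P, w + L1 - 1), dtype=complex)
    for r in range(w):
        T[r * P:(r + 1) * P, r:r + L1] = h.T
    return T

def has_full_column_rank(h, w, tol=1e-8):
    T = filtering_matrix(h, w)
    return bool(np.linalg.matrix_rank(T, tol=tol) == T.shape[1])

rng = np.random.default_rng(1)
L = 3
# Generic two-subchannel channel: no common zeros with probability one.
h_good = rng.standard_normal((L + 1, 2)) + 1j * rng.standard_normal((L + 1, 2))

# Both subchannels share the factor c(z) = 1 - 0.5 z, hence a common zero.
c = np.array([1.0, -0.5])
a, b = rng.standard_normal(L), rng.standard_normal(L)
h_bad = np.stack([np.convolve(a, c), np.convolve(b, c)], axis=1)

checks = {
    "generic, w = L": has_full_column_rank(h_good, L),
    "generic, w = L - 1": has_full_column_rank(h_good, L - 1),
    "common zero, w = L": has_full_column_rank(h_bad, L),
    "common zero, w = L + 2": has_full_column_rank(h_bad, L + 2),
}
```

For $w = L - 1$ the matrix has $2(L-1)$ rows and $2L - 1$ columns, so full column rank is impossible regardless of the channel.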

Assumption A2) ensures that the input sequence is sufficiently
complex to excite the channel, and it is related to
the persistent excitation condition. The minimum required for
the smoothing technique presented in this paper is assumed
here. Larger complexity may be necessary, depending on
the implementations, which will be pointed out later in our
discussion. We note here that A2) is stronger than necessary.
It is shown in [11] that when $P = 2$, the necessary and
sufficient condition for the unique identification of the channel
and its input is A1) and that the input sequence $s_t$ has
linear complexity greater than $2L$. A stronger condition is
required here largely because the smoothing approach
requires both future and past data.
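
The rank condition in (5) suggests a direct numerical test of linear complexity: a sequence has linear complexity greater than $m$ when the Toeplitz matrix with $m + 1$ rows built from it has full row rank. The following sketch is an illustration under that reading; the function name `exceeds_linear_complexity`, the tolerance, and the test sequences are assumptions.

```python
import numpy as np

def exceeds_linear_complexity(s, m, tol=1e-8):
    """True if the (m + 1)-row Toeplitz matrix built from s has full row rank."""
    s = np.asarray(s, dtype=complex)
    cols = len(s) - m
    # Row i holds s[m - i], s[m - i + 1], ..., so consecutive rows are shifted by one.
    S = np.stack([s[m - i:m - i + cols] for i in range(m + 1)])
    return bool(np.linalg.matrix_rank(S, tol=tol) == m + 1)

# An alternating sequence satisfies s_t = -s_{t-1}: linear complexity 1.
alternating = (-1.0) ** np.arange(20)

# A random QPSK sequence is, with overwhelming probability, far more complex.
rng = np.random.default_rng(0)
qpsk = rng.choice(np.array([1, -1, 1j, -1j]), size=100)

results = (
    exceeds_linear_complexity(alternating, 0),
    exceeds_linear_complexity(alternating, 1),
    exceeds_linear_complexity(qpsk, 14),
)
```

With $P = 2$ and $w_0 = L$, the bound in A2) is $L^* = 4L$, so a test of this kind indicates whether a given data block is rich enough for the smoothing approach.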


III.  Geometrical  Properties 
of  Least  Squares  Smoothing 

The essential idea behind the linear prediction and smoothing
approaches to channel estimation rests on the isomorphic
relationship between the output and the input subspaces. It is
this  isomorphic  relation  of  the  two  spaces  that  allows  us  to 
avoid  the  direct  use  of  input  sequence,  using  instead  the  input 
subspace  that  can  be  obtained  from  (noiseless)  observation. 
Our  presentation  relies  heavily  on  geometrical  intuition.  It  is 
therefore  necessary  to  begin  with  precise  definitions  of  relevant 
variables  and  spaces. 


A.  Key  Variables,  Spaces,  and  Isomorphic  Relations 

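From (2), let $\mathbf{s}_t$ be the row vector of input symbols and
$\mathbf{x}_t$ be the data matrix of the noiseless observation defined,
respectively, by

$$\mathbf{s}_t \triangleq [s_t, s_{t+1}, \cdots], \qquad \mathbf{x}_t \triangleq [x(t), x(t+1), \cdots]. \tag{6}$$

For $p > 0$, the input space $\mathcal{S}_{t,p}$ and the output space $\mathcal{X}_{t,p}$ are
the spans of $p$ consecutive input row vectors and output (block) rows

$$\mathcal{S}_{t,p} \triangleq \mathrm{sp}\{\mathbf{s}_t, \mathbf{s}_{t-1}, \cdots, \mathbf{s}_{t-p+1}\} \tag{7}$$

$$\mathcal{X}_{t,p} \triangleq \mathcal{R}\left\{\begin{pmatrix} x(t) & x(t+1) & \cdots \\ \vdots & \text{Block Toeplitz} & \\ x(t-p+1) & \cdots & \end{pmatrix}\right\} = \mathcal{R}\{[x_p(t), x_p(t+1), \cdots]\}. \tag{8}$$

The above definition also applies to $p < 0$, in which case
we have the span of $|p|$ future data vectors. It is also useful
to note that

$$\mathcal{S}_{t,-p} = \mathcal{S}_{t+p-1,p}, \qquad \mathcal{X}_{t,-p} = \mathcal{X}_{t+p-1,p}. \tag{9}$$

Given a linear subspace $\mathcal{S}$, the orthogonal projection $\hat{\mathbf{s}}_{t|\mathcal{S}}$ of $\mathbf{s}_t$
and its projection error $\tilde{\mathbf{s}}_{t|\mathcal{S}}$ are defined by

$$\hat{\mathbf{s}}_{t|\mathcal{S}} = \arg\min_{\mathbf{x} \in \mathcal{S}} \|\mathbf{s}_t - \mathbf{x}\|^2, \qquad \tilde{\mathbf{s}}_{t|\mathcal{S}} = \mathbf{s}_t - \hat{\mathbf{s}}_{t|\mathcal{S}}. \tag{10}$$

Similarly, the (row-wise) projection $\hat{\mathbf{x}}_{t|\mathcal{X}}$ of $\mathbf{x}_t$ onto a linear
subspace $\mathcal{X}$ is a matrix whose rows are projections of the rows of $\mathbf{x}_t$ onto
$\mathcal{X}$.

Playing a critical role in the smoothing as well as LP-based
approaches is the equivalence between the input and output
spaces as a result of A1) and P1.2). Specifically, we have the
following.

Property 2: Under A1), for $w \ge w_0$, $\mathcal{X}_{t,w} = \mathcal{S}_{t,L+w}$, i.e.,
$\mathcal{X}_{t,w}$ is isomorphic to $\mathcal{S}_{t,L+w}$ with isomorphism $\mathcal{F}_w(h)$.

In general, $\mathcal{X}_{t,w} \subset \mathcal{S}_{t,L+w}$ for any $w$. This implies that
given a fixed observation window $w$, the input space $\mathcal{S}_{t,L+w}$
may not be "seen" completely from the output space $\mathcal{X}_{t,w}$.
On the other hand, with A1), all the information of the input
space is contained in the output space $\mathcal{X}_{t,w}$ when subchannels
do not have common zeros [P1.1)] and $w$ is chosen large
enough. Such equivalence enables us to replace the direct
use of input sequence by the use of observation in channel
estimation. Interestingly, when A1) does not hold, $\mathcal{X}_{t,w}$ may
still be a good approximation of $\mathcal{S}_{t,L+w}$, which is one of the
reasons that the algorithm proposed here offers considerable
improvement in robustness over existing methods such as the
subspace algorithm (SS) [7] and cross relation (CR) algorithm
[16].

B. Least Squares Smoothing — The Basic Idea

The isomorphism between the input space $\mathcal{S}_{t,L+w}$ and the
output space $\mathcal{X}_{t,w}$ leads to the following question: Can the
channel be identified from the input subspace without the direct
use of the input sequence $s_t$, and how? Without going into implementation
details, we explain in this section how this can be achieved
by properly constructing subspaces that contain both past and
future data, namely, by smoothing. LSS algorithms and their
implementation issues are presented in Section IV.

1) The Use of Input Subspaces: From (2) and (6), we have

$$\mathbf{x}_t = h_0 \mathbf{s}_t + h_1 \mathbf{s}_{t-1} + \cdots + h_L \mathbf{s}_{t-L}. \tag{11}$$

To avoid cumbersome boundary problems, we assume for the
moment the data size is infinity, i.e., $N = \infty$. Let $\mathcal{S}_t$ be the
subspace that includes all future and past input data except
$\mathbf{s}_t$, i.e.,

$$\mathcal{S}_t \triangleq \mathrm{sp}\{\cdots, \mathbf{s}_{t-1}\} \cup \mathrm{sp}\{\mathbf{s}_{t+1}, \cdots\} = \mathcal{S}_{t-1,\infty} \cup \mathcal{S}_{t+1,-\infty}. \tag{12}$$

Projecting $\mathbf{x}_{t+i}$ onto $\mathcal{S}_t$, with only $h_i \mathbf{s}_t$ not contained in $\mathcal{S}_t$,
we have

$$\hat{\mathbf{x}}_{t+i|\mathcal{S}_t} = \sum_{k,\, k \ne i} h_k \mathbf{s}_{t+i-k} \tag{13}$$

$$\tilde{\mathbf{x}}_{t+i|\mathcal{S}_t} \triangleq \mathbf{x}_{t+i} - \hat{\mathbf{x}}_{t+i|\mathcal{S}_t} = h_i \tilde{\mathbf{s}}_{t|\mathcal{S}_t}. \tag{14}$$

The above process is illustrated in Fig. 2. The similarity of
two right triangles immediately suggests (14).

Note that the projection error $\tilde{\mathbf{s}}_{t|\mathcal{S}_t}$ of $\mathbf{s}_t$ is independent of
$i$. Consequently, we have

$$E \triangleq \begin{pmatrix} \tilde{\mathbf{x}}_{t+L|\mathcal{S}_t} \\ \vdots \\ \tilde{\mathbf{x}}_{t|\mathcal{S}_t} \end{pmatrix} = h\, \tilde{\mathbf{s}}_{t|\mathcal{S}_t}. \tag{15}$$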

From $E$, there are several ways of finding $h$ up to a scaling
factor, and they have different performance when there is
noise and when other implementation issues are considered.
We remark that because subspaces are invariant with respect
to scaling, the identification of $h$ up to a scaling factor using
only the input subspace is the best we should expect. One
approach is the least squares fitting of the column space of $E$:

$$\hat{h} = \arg\max_{\|h\|=1} \|h' E\|^2. \tag{16}$$

The above optimization can be solved by the singular value
decomposition of either $E$ or the sample covariance of the
projection error sequence $R_E = (1/M) E E'$, where $M$ is the
number of columns in $E$.
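
The maximization in (16) can be sketched in a few lines: in the noiseless case the error matrix has rank one, and its dominant left singular vector is the channel up to a scalar. The sizes, the random stand-in for the projection error row, and the names below are assumptions used only for illustration.

```python
import numpy as np

def estimate_from_error(E):
    """Unit-norm maximizer of ||h' E||^2: the dominant left singular vector of E."""
    U, _, _ = np.linalg.svd(E, full_matrices=False)
    return U[:, 0]

rng = np.random.default_rng(0)
h = rng.standard_normal(8) + 1j * rng.standard_normal(8)        # stacked channel vector
e = rng.standard_normal(200) + 1j * rng.standard_normal(200)    # stand-in for the error row
E = np.outer(h, e)                                              # rank-one error matrix

h_hat = estimate_from_error(E)
alignment = abs(np.vdot(h_hat, h)) / np.linalg.norm(h)          # 1 means parallel to h

# Equivalent route through the sample covariance of the error sequence.
R = E @ E.conj().T / E.shape[1]
_, V = np.linalg.eigh(R)                                        # eigenvalues in ascending order
agreement = abs(np.vdot(V[:, -1], h_hat))
```

Both routes return the same direction; the remaining complex scale factor is the inherent ambiguity noted above.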

There  is  an  interesting  connection  with  the  conventional  LS 
approach  when  the  input  sequence  is  known.  Indeed,  were  the 
input  sequence  available,  the  LS  channel  estimate  would  have 
been 

hLS  =xtS,(SS,r 1  =  Es't 


5  = 


(17) 
Fig. 3. Isomorphism between input and output subspaces.


which  is  the  same  as  that  in  (16).  The  second  equality  is 
obtained  by  using  the  formula  of  inverting  matrices  with 
subblocks  [8,  p.  413,  A20.]. 

2) Use of Output Spaces-Least Squares Smoothing: The
identification procedure presented above requires the projection
of $\mathbf{x}_t$ onto the input space $\mathcal{S}_t$. Because $\mathcal{X}_{t,w}$ is isomorphic
to $\mathcal{S}_{t,L+w}$, the application of the above approach using only
the received signal requires only a careful construction of the
output subspace that is isomorphic to $\mathcal{S}_t$. Specifically, using
Property 2 and (9), we have

$$\mathcal{S}_{t,\infty} = \mathcal{X}_{t,\infty}, \qquad \mathcal{S}_{t+1,-\infty} = \mathcal{X}_{t+L+1,-\infty} \tag{18}$$

$$\mathcal{X}_{t,L} \triangleq \mathcal{X}_{t-1,\infty} \cup \mathcal{X}_{t+L+1,-\infty} = \mathcal{S}_t. \tag{19}$$

The isomorphism between the input and output subspaces is
illustrated in Fig. 3. The idea of smoothing arises naturally
as the projection of $\mathbf{x}_t$ onto $\mathcal{S}_t$ in (14) is equivalent to the
projection of $\mathbf{x}_t$ onto the output subspace spanned by all the
past data $\mathcal{X}_{t-1,\infty}$ and future data $\mathcal{X}_{t+L+1,-\infty}$. From (15), we
have the identification equation of the LSS approach [14]

$$E = \begin{pmatrix} \tilde{\mathbf{x}}_{t+L|\mathcal{X}_{t,L}} \\ \vdots \\ \tilde{\mathbf{x}}_{t|\mathcal{X}_{t,L}} \end{pmatrix} = h\, \tilde{\mathbf{s}}_{t|\mathcal{X}_{t,L}} \tag{20}$$

where the left-hand side can be obtained from the observation
alone.


IV.  Algorithm  and  Implementations 

In  this  section,  we  provide  more  details  about  the  LSS 
approach  including  its  properties  and  implementations.  We 
begin  with  a  general  formulation  of  LSS  that  forms  the  basis 
of  our  approach.  Data  structures  of  the  LSS  approach  are 
specified  along  with  their  properties.  We  then  derive  a  joint 
order  detection  and  channel  estimation  algorithm  and  discuss 
its  implementations. 

A.  General  Formulation  of  LSS 

The projection space $\mathcal{X}_{t,L}$ defined in (19) requires the
knowledge of channel order. We consider here a more general
formulation of the problem by defining slightly different
projection spaces that enable us to deal with practical issues
such as finite sample size and unknown channel order. Instead
of using the projection space given in (19), consider the
smoothing of $l + 1$ observations $\mathbf{x}_{t+i}$, $i = 0, \cdots, l$ by forward
and backward predictors of order $w \ge w_0$. The projection
space is given by

$$\mathcal{X}_{t,l} \triangleq \mathcal{X}_{t-1,w} \cup \mathcal{X}_{t+l+1,-w} = \mathcal{X}_{t-1,w} \cup \mathcal{X}_{t+l+w,w}. \tag{21}$$

We notice that $\mathcal{X}_{t,l}$ is essentially the same as that defined in
(19), except that we treat $l$ as a variable not necessarily equal
to the channel order $L$. Because of the isomorphic relation
between the output and input spaces, we have, using (9)

$$\mathcal{X}_{t,l} = \mathcal{S}_{t-1,L+w} \cup \mathcal{S}_{t+l+w,L+w} \triangleq \mathcal{S}_{t,l}. \tag{22}$$

Therefore

$$\mathcal{S}_{t,l} = \begin{cases} \mathrm{sp}\{\mathbf{s}_{t-L-w}, \cdots, \mathbf{s}_t, \cdots, \mathbf{s}_{t+l+w}\}, & l < L \\ \mathrm{sp}\{\mathbf{s}_{t-L-w}, \cdots, \mathbf{s}_{t-1}\} \cup \mathrm{sp}\{\mathbf{s}_{t+l-L+1}, \cdots, \mathbf{s}_{t+l+w}\}, & l \ge L. \end{cases} \tag{23}$$

Projecting $\mathbf{x}_{t+i}$, $i = 0, \cdots, l$ onto $\mathcal{X}_{t,l} = \mathcal{S}_{t,l}$, we have the
following result as a generalization of (15).

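Theorem 1: Let the forward and backward predictor order
$w \ge w_0$. Let $\mathcal{X}_{t,l}$ be defined in (22), and let $E_{t,l}$ be the
projection error matrix defined by

$$E_{t,l} \triangleq \begin{pmatrix} \tilde{\mathbf{x}}_{t+l|\mathcal{X}_{t,l}} \\ \vdots \\ \tilde{\mathbf{x}}_{t|\mathcal{X}_{t,l}} \end{pmatrix}. \tag{24}$$

Then

$$E_{t,l} = \begin{cases} 0, & l < L \\ \mathcal{H}_l(h) \begin{pmatrix} \tilde{\mathbf{s}}_{t+l-L|\mathcal{S}_{t,l}} \\ \vdots \\ \tilde{\mathbf{s}}_{t|\mathcal{S}_{t,l}} \end{pmatrix}, & l \ge L \end{cases} \qquad \mathcal{H}_l(h) \triangleq \underbrace{\begin{pmatrix} h_L & & \\ \vdots & \ddots & \\ h_0 & & h_L \\ & \ddots & \vdots \\ & & h_0 \end{pmatrix}}_{l-L+1 \text{ columns}}. \tag{25}$$

Further, if $\{s_t\}$ has linear complexity greater than $2w + l + L$
and $l \ge L$

$$\mathcal{C}\{E_{w+1,l}\} = \mathcal{C}\{\mathcal{H}_l(h)\}. \tag{26}$$

Proof: From (11), (21), (23), we have, for $0 \le i \le l$

$$\tilde{\mathbf{x}}_{t+i|\mathcal{X}_{t,l}} = \tilde{\mathbf{x}}_{t+i|\mathcal{S}_{t,l}} \tag{27}$$

$$= \begin{cases} 0, & l < L \\ \sum_{k=i-l+L}^{i} h_k \tilde{\mathbf{s}}_{t+i-k|\mathcal{S}_{t,l}}, & l \ge L. \end{cases} \tag{28}$$

With $h_k = 0$ for all $k < 0$ and $k > L$, the above equation
leads to (25).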
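To prove (26), we need to show that the projection error
matrix of the input sequence in (25) has full row rank. Consider
the Toeplitz matrix

$$S = \begin{pmatrix} s_{2w+l+1} & \cdots & s_N \\ \vdots & & \vdots \\ s_{w+l-L+2} & \cdots & \\ s_{w+l-L+1} & \cdots & \\ \vdots & & \vdots \\ s_{w+1} & \cdots & \\ \vdots & & \vdots \\ s_{1-L} & \cdots & \end{pmatrix} = \begin{pmatrix} A \\ B \\ C \end{pmatrix} \tag{29}$$

where $A$ contains the rows starting with $s_{2w+l+1}, \cdots, s_{w+l-L+2}$, $B$ the rows
starting with $s_{w+l-L+1}, \cdots, s_{w+1}$, and $C$ the remaining rows down to the
one starting with $s_{1-L}$. With $t = w + 1$ in (25), we note that

$$\begin{pmatrix} \tilde{\mathbf{s}}_{w+l-L+1|\mathcal{S}_{w+1,l}} \\ \vdots \\ \tilde{\mathbf{s}}_{w+1|\mathcal{S}_{w+1,l}} \end{pmatrix} = \mathcal{P}_D^{\perp}\{B\}, \qquad D \triangleq \begin{pmatrix} A \\ C \end{pmatrix}. \tag{30}$$

When $\{s_t\}$ has linear complexity greater than $2w + l + L$, $S$
has full row rank, which implies that $\mathcal{P}_D^{\perp}\{B\}$ has full row
rank. We now have (26). □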

The above result holds the key to our approach, especially
when the channel order $L$ is unknown. When the smoothing
window size $l$ is too small, the smoothing error contains no
information about the channel because all input sequences are
in the output space. When $l = L$, we have the case described
in Section III-B, where the channel vector spans the column
space of $E_{t,l}$. When the window size $l$ is greater than the
channel order $L$, the projection space misses more than $\mathbf{s}_t$,
which complicates the channel identification. Nonetheless, it
is shown in Section IV-C that the column space of $E_{t,l}$ still
uniquely determines the channel vector, which, along with
another useful property of $E_{t,l}$, forms the basis of a joint
order-detection and channel estimation algorithm that requires
only the upper bound of the channel order.
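
The three regimes described above can be observed numerically. The sketch below forms the finite-sample smoothing error for a noiseless two-subchannel channel and reports its rank for $l < L$, $l = L$, and $l > L$; Theorem 1 predicts 0, 1, and $l - L + 1$. The helper name `smoothing_error`, the block layout (latest sample on top), the tolerance, and all parameter values are assumptions of this illustration.

```python
import numpy as np

def smoothing_error(y, w, l, tol=1e-8):
    """Error of projecting the l + 1 current block rows onto the span of
    w past and w future block rows. y has shape (N, P)."""
    N, P = y.shape
    M = N - 2 * w - l                                   # number of columns
    Z = np.vstack([y[r:r + M].T for r in reversed(range(2 * w + l + 1))])
    F, Y, Pst = Z[:w * P], Z[w * P:(w + l + 1) * P], Z[(w + l + 1) * P:]
    D = np.vstack([F, Pst])                             # future on top of past
    _, sv, Vh = np.linalg.svd(D, full_matrices=False)
    U = Vh[sv > tol * sv[0]]                            # orthonormal rows spanning row space of D
    return Y - (Y @ U.conj().T) @ U

rng = np.random.default_rng(0)
P, L, w, N = 2, 3, 3, 100
h = rng.standard_normal((L + 1, P)) + 1j * rng.standard_normal((L + 1, P))
s = rng.choice(np.array([1, -1, 1j, -1j]), size=N + L)  # QPSK input
# Noiseless outputs x(t) = sum_i h_i s_{t-i}, one column per subchannel.
x = np.stack([np.convolve(s, h[:, p], mode="valid") for p in range(P)], axis=1)

ranks = {l: int(np.linalg.matrix_rank(smoothing_error(x, w, l), tol=1e-8))
         for l in (1, 3, 5)}
```

Here $w = L$ is used because $w_0 = L$ when $P = 2$; the numerical rank of the projection space is chosen by a tolerance, which is appropriate only for this noise-free check.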


B. Data Structures

We consider now the problem of estimating the channel
using only a finite number of received signal samples $y(t)$, $t =
1, \cdots, N$. For a fixed predictor size $w \ge w_0$ and smoothing
window $l > 0$, define the overall data matrix

$$Z_{w,l} \triangleq \begin{pmatrix} y(2w+l+1) & \cdots & y(N) \\ \vdots & & \vdots \\ y(w+l+2) & \cdots & \\ y(w+l+1) & \cdots & \\ \vdots & & \vdots \\ y(w+1) & \cdots & \\ y(w) & \cdots & \\ \vdots & & \vdots \\ y(1) & \cdots & y(N-2w-l) \end{pmatrix} = \begin{pmatrix} F_{w,l} \\ Y_{w,l} \\ P_{w,l} \end{pmatrix} \tag{31}$$

from which we have defined the "current" data matrix $Y_{w,l}$,
the "past" data matrix $P_{w,l}$, and the "future" data matrix $F_{w,l}$.

Denote the "future-past" data matrix $D_{w,l}$ as

$$D_{w,l} \triangleq \begin{pmatrix} F_{w,l} \\ P_{w,l} \end{pmatrix}, \qquad P_{w,l} = [y_w(w), \cdots, y_w(N-w-l-1)], \qquad F_{w,l} = [y_w(2w+l+1), \cdots, y_w(N)] \tag{32}$$

$$Y_{w,l} = [y_{l+1}(w+l+1), \cdots, y_{l+1}(N-w)]. \tag{33}$$

To see the relation between these data matrices and various
spaces, we summarize their properties. The rank conditions
given below are useful in dealing with noise by finding the
least squares approximation of the noisy data matrix.

Property 3: Suppose that the input sequence has linear
complexity greater than $2w + l + L + 1$ and there is no noise.
For $w \ge w_0$, we have the following properties.

P3.1) Data Matrix $Z_{w,l}$:

$$\operatorname{rank}(Z_{w,l}) = 2w + l + L + 1. \tag{34}$$

P3.2) Past Data Matrix $P_{w,l}$:

$$\mathcal{R}\{P_{w,l}\} = \mathcal{X}_{w,w} = \mathcal{S}_{w,w+L} \tag{35}$$

$$\operatorname{rank}(P_{w,l}) = L + w. \tag{36}$$

P3.3) Future Data Matrix $F_{w,l}$:

$$\mathcal{R}\{F_{w,l}\} = \mathcal{X}_{2w+l+1,w} = \mathcal{S}_{2w+l+1,w+L} \tag{37}$$

$$\operatorname{rank}(F_{w,l}) = \operatorname{rank}(P_{w,l}) = L + w. \tag{38}$$

P3.4) Projection Data Matrix $D_{w,l}$:

$$\mathcal{R}\{D_{w,l}\} = \mathcal{S}_{w,w+L} \cup \mathcal{S}_{w+l-L+2,-w-L} \tag{39}$$

$$\operatorname{rank}(D_{w,l}) = \begin{cases} 2w + l + L + 1, & l < L \\ 2w + 2L, & L \le l \le w. \end{cases} \tag{40}$$

Proof: See the Appendix.

C. J-LSS: Joint Order Detection and
Channel Estimation via LSS



If the channel order is known or can be detected, channel
estimation by LSS can be derived directly from (20). This
approach and its adaptive implementations are explored in
[14], [18], and [17]. Here, we describe a joint order detection
and channel estimation approach based on Theorem 1 and
the data structure defined above, assuming only that an upper
bound of channel order is available.

The idea here is to fit the smoothing error matrix $E_{w,l}$
by jointly choosing both the channel order and the channel
impulse response. With fixed $l$ as the upper bound of the true
channel order $L$, recall Theorem 1 for the case when $l > L$.
Consider the smoothing error matrix $E_{l,l} = \mathcal{P}_{D_{l,l}}^{\perp}\{Y_{l,l}\}$
obtained as the error of projecting $Y_{l,l}$ onto the row space of $D_{l,l}$. We
now have from (26), when there is no noise

$$\mathcal{C}\{E_{l,l}\} = \mathcal{C}\{\mathcal{H}_l(h)\}. \tag{41}$$

Letting $Q = [Q_0, \cdots, Q_l]$ be the matrix whose row vectors
are orthogonal to the range space of $E_{l,l}$, we then have

$$[Q_0, \cdots, Q_l] E_{l,l} = 0$$

which implies

$$\begin{pmatrix} Q_0 & \cdots & Q_L \\ \vdots & \text{Block Hankel} & \vdots \\ Q_{l-L} & \cdots & Q_l \end{pmatrix} \begin{pmatrix} h_L \\ \vdots \\ h_0 \end{pmatrix} = \mathcal{T}_L(Q) h = 0. \tag{42}$$


In  other  words,  the  channel  coefficients  satisfy  a  homogeneous 
linear  equation.  What  remains  to  be  answered  is  whether  the 
solution  is  unique  up  to  a  scaling  factor.  The  proposed  joint 
order  detection  and  channel  estimation  algorithm  is  motivated 
by  the  following  Theorem. 

Theorem 2: Assume that there is no noise, and the input
sequence has linear complexity greater than $3l + L$. Assume
also that the channels do not share common zeros. Let
$E_{l,l} = \mathcal{P}_{D_{l,l}}^{\perp}\{Y_{l,l}\}$ be the projection error matrix, and let
$R_E = (1/(N - 3l)) E_{l,l} E_{l,l}'$ be the sample covariance matrix
of the smoothing error sequence. Let the rows of $Q$ be the
singular vectors associated with the $(P(l+1) - l + k - 1)$
smallest singular values of $R_E$, and let $Q$ be partitioned by
$(P(l+1) - l + k - 1) \times P$ submatrices $Q = [Q_0, \cdots, Q_l]$. Define

$$\mathcal{T}_k(Q) \triangleq \begin{pmatrix} Q_0 & \cdots & Q_k \\ \vdots & \text{Block Hankel} & \vdots \\ Q_{l-k} & \cdots & Q_l \end{pmatrix}. \tag{43}$$

Then, the homogeneous linear equation

$$\mathcal{T}_k(Q) z = 0 \tag{44}$$

has the unique nontrivial solution $z = \alpha h$ when $k = L$ and
trivial solutions otherwise.

Proof:  See  the  Appendix. 

We note that the above result does not apply to the subspace
algorithm. When $k = L$, (44) defines a channel estimator
that bears some similarity to the subspace algorithm used by
Moulines et al. [7]. In both cases, the so-called noise subspace
is used in constructing a homogeneous linear equation of
which the channel vector is a unique solution. However, there
are several important differences. First, the filtering matrix
$\mathcal{F}_w(h)$ used in the subspace approach is different from the
smoothing error matrix $E_{l,l}$. Perhaps more importantly, the
homogeneous equation used in the subspace algorithm has
nontrivial solutions when the estimated channel order is larger
than the true channel order, which is the reason that the
joint order detection and channel estimation approach does
not apply to the subspace algorithm directly.

It is perhaps surprising that when $k \ne L$, (44) has only the
trivial solution. Intuitively, we can argue as follows. When the
channel order is overdetermined, i.e., $k > L$, in constructing
$Q$, we must include eigenvectors that are in the range space of
$\mathcal{H}_l(h)$, which leads to inconsistency of $Q E_{l,l} = 0$. (Note that
such inconsistency does not occur for the subspace algorithm
when the channel order is overdetermined.) For a generically
chosen channel, $\mathcal{T}_k(Q)$ has full column rank. On the other
hand, when the channel order is underestimated, i.e., $k < L$, there
are an insufficient number of parameters to specify the null
space of $R_E$.


Theorem 2 enables us to define the following joint channel
order detection and estimation criterion:

$$\{\hat{L}, \hat{h}\} = \arg\min_{k,\, \|h\|=1} \|\mathcal{T}_k(Q) h\|^2. \tag{45}$$

The above optimization has a closed-form solution involving
the singular vector associated with the smallest singular value.
The joint order detection and channel estimation approach,
which is referred to as J-LSS, is summarized in Fig. 4.

There are many ways of implementing the algorithm outlined
in Fig. 4. We discuss here several implementation issues
that are likely to affect the performance.

The Smoothing Window Size $l$: It is clearly possible to implement
the algorithm with variable smoothing window size.
For simplicity, we considered the fixed window size case. Although
not necessary for $P > 2$, the smoothing window size $l$,
in theory, upper bounds the channel order. In practice, channel
order is perhaps fictitious, and we can always argue that $l$ can
never upper bound the "true" channel order. Fortunately, when
$h_k \ne 0$ for $k > l$, the performance is not drastically affected as
long as these "spill-out" coefficients are sufficiently small. In
the simulation example shown in Section V, the robustness of
J-LSS with respect to the underestimation of channel order
is clearly demonstrated. In such a case, the finite-sample
convergence property is lost as in all other algorithms.

Order Selection for the Predictors:

In selecting the order $w$ for the forward and backward
predictors, we should observe the following factors. First, for
fixed data length, large $w$ implies fewer columns
in data matrices. This corresponds to smaller sample size in
least squares problems. In this regard, it is desirable to choose
$w$ as small as possible, which is the reason why we have
considered $w = l$ in the algorithm. Certainly, if $P > 2$,
a smaller $w$ can be chosen. On the other hand, larger $w$
may provide a certain degree of robustness, especially when
subchannels have zeros approximately common near the unit
circle. It is clearly possible to vary the predictor size $w$ with $k$.


V.  Simulation  Examples 


A.  Algorithm  Characteristics  and  Performance  Measure 

Simulation studies of the proposed LSS algorithms as they
are compared with existing techniques listed in Table I are
presented in this section. We remark that only J-LSS does not
require the knowledge of channel order while still preserving
the finite-sample convergence property.

Algorithms are compared by Monte Carlo simulation using
the normalized root mean square error (NRMSE) as a
performance measure. Specifically, NRMSE (see footnote 1) is defined by

$$\mathrm{NRMSE} = \frac{1}{\|h\|} \sqrt{\frac{1}{M} \sum_{m=1}^{M} \|\hat{h}^{(m)} - h\|^2} \tag{46}$$

1 The inherent ambiguity was removed before the computation of NRMSE.
This includes scaling $\hat{h}^{(m)}$ and adjusting delays by adding zeros to either
$h$ or $\hat{h}^{(m)}$.
J-LSS Order Detection and Channel Estimation

1. Choose $l > L$ and form data matrices $Y_{l,l}$ and $D_{l,l}$.

2. Obtain the $4l$ orthogonal basis $\{u_1, \cdots, u_{4l}\}$ that spans the row space of $D_{l,l}$.

3. Obtain the projection error of $Y_{l,l}$ onto $\mathrm{sp}\{u_1, \cdots, u_{4l}\}$:

$$E_{l,l} = Y_{l,l} - Y_{l,l} U' U, \qquad U = \begin{pmatrix} u_1 \\ \vdots \\ u_{4l} \end{pmatrix}. \tag{66}$$

4. For each $1 < k < l$, treated as the estimated channel order, let $Q = [Q_0, \cdots, Q_l]$ be the matrix
whose rows are the last $(P(l+1) - l + k - 1)$ left singular vectors of $E_{l,l}$, or equivalently, of the
sample covariance of the smoothing error sequence. Form

$$\mathcal{T}_k(Q) = \begin{pmatrix} Q_0 & \cdots & Q_k \\ \vdots & \text{Block Hankel} & \vdots \\ Q_{l-k} & \cdots & Q_l \end{pmatrix}. \tag{67}$$

5. Joint Order Detection and Channel Estimation:

$$\{\hat{L}, \hat{h}\} = \arg\min_{k,\, \|h\|=1} \|\mathcal{T}_k(Q) h\|^2. \tag{68}$$

Fig. 4. J-LSS algorithm.
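
The steps of Fig. 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the block layout (latest sample on top), and the test parameters are hypothetical; the basis of the projection space is selected by a numerical-rank tolerance instead of a fixed count of $4l$ vectors, which suits the noise-free test used here; and the candidate orders are restricted to $1 \le k < l$, because for $k = l$ the block Hankel matrix has fewer rows than columns and always has a nontrivial null vector.

```python
import numpy as np

def smoothing_error(y, w, l, tol=1e-8):
    """Steps 1-3: error of projecting current block rows onto past and future ones."""
    N, P = y.shape
    M = N - 2 * w - l
    Z = np.vstack([y[r:r + M].T for r in reversed(range(2 * w + l + 1))])
    F, Y, Pst = Z[:w * P], Z[w * P:(w + l + 1) * P], Z[(w + l + 1) * P:]
    D = np.vstack([F, Pst])
    _, sv, Vh = np.linalg.svd(D, full_matrices=False)
    U = Vh[sv > tol * sv[0]]                  # orthonormal rows spanning row space of D
    return Y - (Y @ U.conj().T) @ U

def block_hankel(Q, P, l, k):
    """Stack [Q_i, ..., Q_{i+k}] for i = 0, ..., l - k, with Q = [Q_0, ..., Q_l]."""
    return np.vstack([Q[:, i * P:(i + k + 1) * P] for i in range(l - k + 1)])

def jlss(y, l):
    """Steps 4-5 with w = l. Returns (order estimate, channel estimate of shape (order + 1, P))."""
    P = y.shape[1]
    U = np.linalg.svd(smoothing_error(y, l, l))[0]    # left singular vectors of E
    best = None
    for k in range(1, l):                             # candidate orders (assumed range)
        r = P * (l + 1) - l + k - 1
        Q = U[:, -r:].conj().T                        # rows annihilate the dominant part of E
        _, sv, Vh = np.linalg.svd(block_hankel(Q, P, l, k))
        if best is None or sv[-1] < best[0]:
            best = (sv[-1], k, Vh[-1].conj())         # smallest singular value and its vector
    _, L_hat, z = best
    return L_hat, z.reshape(L_hat + 1, P)[::-1]       # z stacks h_k, ..., h_0

rng = np.random.default_rng(0)
P, L, l, N = 2, 2, 4, 120
h = rng.standard_normal((L + 1, P)) + 1j * rng.standard_normal((L + 1, P))
s = rng.choice(np.array([1, -1, 1j, -1j]), size=N + L)
x = np.stack([np.convolve(s, h[:, p], mode="valid") for p in range(P)], axis=1)

L_hat, h_hat = jlss(x, l)
alignment = abs(np.vdot(h_hat.ravel(), h.ravel())) / np.linalg.norm(h)
```

In this noise-free run the selected order should equal the true order and the estimate should be parallel to the true channel, in line with Theorem 2. With noisy data the tolerance-based rank selection would need to be replaced by a fixed basis size as in step 2.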


TABLE I
List of Algorithms Compared in the Simulation and Their Characteristics

Abbreviation    Name                                            Convergence    Order Required?
SS [7]          The Subspace Algorithm                          finite         Yes
CR [16], [5]    The Cross Relation Algorithm                    finite         Yes
LP-SS [9]       Linear Prediction-Subspace Algorithm            finite         Yes
LP-LS [1]       Linear Prediction-Least Squares Algorithm       infinite       No
MSP [4]         Multistep Linear Prediction Algorithm           infinite       No
LSS [14]        Least Squares Smoothing Algorithm               finite         Yes
J-LSS           Joint Order Detection and Channel Estimation    finite         No
                by Least Squares Smoothing

where $\hat{h}^{(m)}$ was the estimated channel from the $m$th trial.
Noise samples are generated from an i.i.d. zero-mean Gaussian
random sequence, and the signal-to-noise ratio (SNR) was
defined and given by


where $\sigma^2$ was the noise variance. The input sequence to the
channel is an i.i.d. quadrature phase shift keying (QPSK)
complex sequence.
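
A minimal sketch of an NRMSE computation with the scaling ambiguity removed is given below. It assumes the usual form of the measure (root of the mean squared error over trials, normalized by the norm of the true channel) and resolves the ambiguity with a least squares complex scale factor per trial; the delay adjustment mentioned in the footnote is not handled. The function names are hypothetical.

```python
import numpy as np

def remove_scaling(h_est, h_true):
    """Scale h_est by the complex factor that minimizes ||alpha * h_est - h_true||."""
    alpha = np.vdot(h_est, h_true) / np.vdot(h_est, h_est)
    return alpha * h_est

def nrmse(estimates, h_true):
    """Normalized root mean square error over a collection of channel estimates."""
    h_true = np.ravel(np.asarray(h_true, dtype=complex))
    sq_errs = [
        np.linalg.norm(remove_scaling(np.ravel(np.asarray(e, dtype=complex)), h_true) - h_true) ** 2
        for e in estimates
    ]
    return float(np.sqrt(np.mean(sq_errs)) / np.linalg.norm(h_true))

h_true = np.array([1.0 + 0.5j, -0.3j, 0.2])
# Estimates that differ from the truth only by a complex scale give zero error.
scaled_only = nrmse([2j * h_true, -0.3 * h_true], h_true)
# One estimate with an error component orthogonal to the true channel.
with_error = nrmse([[1.0, 0.5]], [1.0, 0.0])
```

For the second case the optimal scale is $1/1.25$, so the error is $\sqrt{0.25/1.25} = \sqrt{0.2}$.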

B.  Performance  Comparison:  A  Multipath  Channel 

Fig. 5 shows the NRMSE performance comparison using
a four-ray multipath channel generated from the raised-cosine
pulse. The T/2-sampled channel parameters are given in
Table II, with even and odd samples corresponding to the two
subchannels. This channel has severe intersymbol interference.
It is also close to violating the identifiability condition in the
sense that the filtering matrix $\mathcal{T}(h)$ has a condition number


Fig. 5. NRMSE versus SNR for the multipath channel. One
hundred Monte Carlo runs. One hundred input symbols. Legend:
CR: 'o'; LP-SS: '+'; MSP: 'x'; LSS: '*'.


TABLE II
Multipath Channel

| $t$ | $h_t$ |
|---|---|
| 1 | $-0.0031 - j0.0017$ |
| 2 | $-0.0016 - j0.0047$ |
| 3 | $-0.0109 - j0.0025$ |
| 4 | $-0.0263 - j0.0433$ |
| 5 | $0.1522 + j0.0705$ |
| 6 | $0.4409 + j0.4736$ |
| 7 | $0.3789 + j0.5930$ |
| 8 | $0.0766 + j0.2168$ |
| 9 | $-0.0301 - j0.0348$ |
| 10 | $-0.0042 - j0.0154$ |
| 11 | $-0.0032 - j0.0017$ |
| 12 | $-0.0017 - j0.0044$ |

around $3 \times 10^3$. We have also performed simulation comparisons
for well-conditioned channels [18]. The performance of
J-LSS and SS is comparable in those cases.
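The conditioning of this channel can be inspected numerically with the sketch below. It splits the Table II taps into two subchannels and builds a block Toeplitz filtering matrix in the standard form; the paper's own definition of the filtering matrix appears in an earlier section, and the window `w = 6` used here is a hypothetical choice, so the value obtained need not match the figure quoted above.

```python
import numpy as np

# T/2-sampled taps h_1, ..., h_12 from Table II.
h_t = np.array([
    -0.0031 - 0.0017j, -0.0016 - 0.0047j, -0.0109 - 0.0025j, -0.0263 - 0.0433j,
     0.1522 + 0.0705j,  0.4409 + 0.4736j,  0.3789 + 0.5930j,  0.0766 + 0.2168j,
    -0.0301 - 0.0348j, -0.0042 - 0.0154j, -0.0032 - 0.0017j, -0.0017 - 0.0044j,
])

# Alternate samples form the two subchannels: shape (P, L+1) = (2, 6), so L = 5.
h = np.stack([h_t[0::2], h_t[1::2]])

def filtering_matrix(h, w):
    """Block Toeplitz matrix of size (P*w) x (w+L); block row i holds the
    P x (L+1) tap array starting at column i (assumed standard form)."""
    P, n_taps = h.shape
    T = np.zeros((P * w, w + n_taps - 1), dtype=complex)
    for i in range(w):
        T[P * i:P * (i + 1), i:i + n_taps] = h
    return T

w = 6                                   # hypothetical window, chosen so that P*w >= w + L
T = filtering_matrix(h, w)
kappa = np.linalg.cond(T)               # ratio of largest to smallest singular value
print(T.shape, kappa)
```

A large ratio indicates that the filtering matrix is close to losing full column rank, which is the sense in which the channel is close to violating the identifiability condition.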


SNR  =  E< 
raz 
Fig. 6. Scatter-plots of 100 estimates (h_est plotted with h_true, multipath channel) at SNR = 30 dB. Solid lines: true channel. Left: J-LSS estimates. Right: SS estimates.


Observations  and  Discussions: 

•  J-LSS performs considerably better than the rest of the
algorithms, although its behavior is somewhat peculiar.
Because the multipath channel has small head and tail
taps, correct channel order detection is difficult. Consequently,
J-LSS almost always underestimates the channel
order over the SNR range of 20-80 dB. On
the other hand, it is perhaps not wise to estimate these
small head and tail taps. Instead, it is better, as J-LSS
apparently aims to do, to find the channel order and the
impulse response that together match the data in some optimal
way. Fig. 6 shows the scatterplot at 30 dB SNR of the
magnitude response of the J-LSS and the SS algorithms.
In this case, the J-LSS algorithm has detected channel
order $\hat{L} = 1$ (rather than the true channel order $L = 5$).
As we can see, the J-LSS algorithm captures the four
major taps of the channel impulse response. In contrast,
when the true channel order is used in the SS algorithm,
the performance of the estimator is rather poor.

•  It appears that CR, SS, and LSS perform comparably.
Indeed, all of them assume knowledge of the channel order,
and all have the finite sample convergence property,
although this shows up only at relatively high SNR. From
the implementation point of view, the advantage goes to
the LSS algorithm, which has a recursive implementation
both in time and in channel order [17], [18].

•  It is interesting to observe that when the channel order is
correctly detected at high SNR, J-LSS is slightly worse
than CR, SS, and LSS, although the difference eventually
disappears as SNR $\to \infty$. This is due to the selection of
$l > L$ in J-LSS, which reduces the effective sample size
in the estimation.

•  MSP performs better than LP-SS and LP-LS because it
estimates the channel in a single step, whereas LP-SS
and LP-LS estimate $h_0$ first. For this multipath channel,
the estimate of $h_0$ is rather poor. MSP and LP-LS level
off as SNR $\to \infty$ because of the loss of finite sample
convergence. The floor decreases as the number of samples
increases. Note also that LP-SS does exhibit finite sample
convergence, although its breaking point occurs about 20
dB higher than that of CR, SS, and LSS.


Fig. 7. NRMSE versus SNR for the multipath channel with underestimated order. One
hundred Monte Carlo runs. One hundred input symbols. Channel order
under-determined by 1. Legend: CR: 'o'; LP-SS: '+'; MSP: 'x'.

•  In deriving the algorithm, we have assumed that the
smoothing window $l$ is greater than the channel order.
It is interesting to test the robustness of J-LSS when this
is not true. Fig. 7 shows the performance of these
algorithms when the channel order is underestimated. In
this simulation, the upper bound on the channel order
used in J-LSS and the channel order used in all other
algorithms are underestimated by one. We see that J-LSS
performs better than all other algorithms throughout the
entire SNR range. The flooring effect of all algorithms
comes from the underestimation of the channel order.
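The remark in the first observation that an order-1 model captures the four major taps can be checked against Table II. The sketch below assumes that the four major taps are $t = 5, \ldots, 8$ (two consecutive symbol delays in each of the two subchannels) and computes the fraction of channel energy they carry, together with the relative error that remains if all other taps are set to zero. It illustrates why neglecting the small head and tail taps costs little. It is not a reproduction of the simulation.

```python
import numpy as np

# T/2-sampled taps h_1, ..., h_12 from Table II.
h_t = np.array([
    -0.0031 - 0.0017j, -0.0016 - 0.0047j, -0.0109 - 0.0025j, -0.0263 - 0.0433j,
     0.1522 + 0.0705j,  0.4409 + 0.4736j,  0.3789 + 0.5930j,  0.0766 + 0.2168j,
    -0.0301 - 0.0348j, -0.0042 - 0.0154j, -0.0032 - 0.0017j, -0.0017 - 0.0044j,
])

energy = np.abs(h_t) ** 2
major = slice(4, 8)                        # taps t = 5, ..., 8 (assumed major taps)

# Share of the total channel energy carried by the four major taps.
fraction = energy[major].sum() / energy.sum()

# Relative error ||h - h_truncated|| / ||h|| when the other eight taps are zeroed.
truncation_error = np.sqrt(1.0 - fraction)
print(fraction, truncation_error)
```

With these values the four major taps carry more than 99% of the channel energy, so a low-order model can still fit the data closely.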

VI.  Conclusion 

We have presented a geometrical approach to the least
squares smoothing algorithm for the blind estimation of multichannel
finite-impulse response channels. The main idea
arises from the isomorphic relationship between the input
and output spaces, which serves as the basis of smoothing
and linear prediction-based algorithms. The LSS approach to
channel estimation preserves the finite sample convergence
property critical to short data sample applications. The main
attraction of the joint channel order detection and channel
estimation algorithm is that it does not require knowledge of
the channel order and, at the same time, preserves the finite sample
convergence property. There are, of course, several weaknesses
of J-LSS. It requires a number of eigendecompositions that
can be computationally expensive (it costs about $L_u$ times
more than the subspace algorithms, where $L_u$ is the upper
bound of the channel order). For certain channels, the joint
order detection and channel estimation approach may not be
as effective as one that detects the channel order first and
then applies CR, SS, or especially LSS. In [17] and [18], we
explore this strategy by developing time- and order-recursive
schemes based on LSS.


Appendix 

Proof of Property 3: Denote

$$S_{k,t} = \begin{pmatrix} s_t & \cdots & s_{N-2w-l-1+t} \\ \vdots & \text{Toeplitz} & \vdots \\ s_{t-k+1} & \cdots & s_{N-2w-l-k+t} \end{pmatrix}. \tag{48}$$

When $s_k$ has linear complexity greater than $2w + l + L$,

$$\mathrm{rank}\{S_{2w+l+1+L,\,2w+l+1}\} = 2w + l + L + 1. \tag{49}$$

From (3), when there is no noise, we have

$$Z_{w,l} = \mathcal{T}_{2w+l+1}(h)\, S_{2w+l+1+L,\,2w+l+1} \tag{50}$$

$$F_{w,l} = \mathcal{T}_w(h)\, S_{w+L,\,2w+l+1}, \qquad P_{w,l} = \mathcal{T}_w(h)\, S_{w+L,\,w}. \tag{51}$$

Because of P1.2) and (49), we have (34)-(38).

To prove P3.4), we note that

$$D_{w,l} = \underbrace{\begin{pmatrix} \mathcal{T}_w(h) & 0_{wP \times (w+L)} \\ 0_{wP \times (w+L)} & \mathcal{T}_w(h) \end{pmatrix}}_{H} \begin{pmatrix} S_{w+L,\,2w+l+1} \\ S_{w+L,\,w} \end{pmatrix}. \tag{52}$$

Under A1), $H$ has full column rank, and hence

$$\mathcal{R}\{D_{w,l}\} = \mathcal{R}\left\{ \begin{pmatrix} S_{w+L,\,2w+l+1} \\ S_{w+L,\,w} \end{pmatrix} \right\}. \tag{53}$$

Because of (49), we have (40). □

Proof of Theorem 2: When channels do not share common
zeros, it is sufficient to consider the case $P = 2$. Let

$$h(z) = \begin{pmatrix} f(z) \\ g(z) \end{pmatrix}$$

where the two subchannels are $f(z) = \sum_{k=0}^{L} f_k z^{-k}$, $g(z) = \sum_{k=0}^{L} g_k z^{-k}$. Define

$$H_l(f) \triangleq \underbrace{\begin{pmatrix} f_0 & & \\ \vdots & \ddots & \\ f_L & & f_0 \\ & \ddots & \vdots \\ & & f_L \end{pmatrix}}_{l-L+1\ \text{columns}}, \qquad H_{l,L} = \begin{pmatrix} H_l(f) \\ H_l(g) \end{pmatrix}. \tag{54}$$

For convenience, by rearranging rows of $E_{l,l}$ we have, for
$l > L$,

$$E_{l,l} = \begin{pmatrix} H_l(f) \\ H_l(g) \end{pmatrix} \begin{pmatrix} \tilde{s}_{t+l-L} \\ \vdots \\ \tilde{s}_{t} \end{pmatrix} = H_{l,L} \begin{pmatrix} \tilde{s}_{t+l-L} \\ \vdots \\ \tilde{s}_{t} \end{pmatrix}. \tag{55}$$

We now consider the equivalent problem of finding $h$ from
the column space of $H_{l,L}$.

Define

$$q(x) = f_0 + \cdots + f_L x^L + x^{l+1}(g_0 + \cdots + g_L x^L) \tag{56}$$

and let $\{z_1, \ldots, z_{l+L+1}\}$ be the $l + L + 1$ distinct roots. Then,
columns of the Vandermonde matrix $V_{2l+2}(z_1^*, \ldots, z_{l+L+1}^*)$
form the orthogonal complement of the column space of $H_{l,L}$.
We now consider three separate cases:

I) $k = L$;

II) $L < k < l$;

III) $k < L$.

Case I ($k = L$): In this case, the full null space of $H_{l,L}^H$ is
used. Constructing matrix $Q$ from singular vectors is equivalent
to constructing it from $V_{2l+2}(z_1^*, \ldots, z_{l+L+1}^*)$, whose columns span
the null space of $H_{l,L}^H$. Therefore, solving $h$ from (44) is
equivalent to solving

$$V_{2l+2}^{H}(z_1^*, \ldots, z_{l+L+1}^*) \begin{pmatrix} H_l(f) \\ H_l(g) \end{pmatrix} = 0 \tag{57}$$

which, after removing redundant equations, leads to solving
the homogeneous equation (58), shown at the top of the next
page. Because the roots $\{z_i\}$ are distinct, the solution of the
above is unique.

Case II ($L < k < l$): In this case, matrix $Q$ is constructed from the
entire null space of $H_{l,L}^H$ along with $k - L$ eigenvectors in
the range space of $H_{l,L}$. In other words,

$$\mathcal{T}_k(Q) = G \begin{pmatrix} \Phi_{l,k} \\ V \end{pmatrix} \tag{59}$$

where

$G$    full-rank matrix;

$\Phi_{l,k}$    corresponds to the matrix constructed from columns
of $V_{2l+2}(z_1^*, \ldots, z_{l+L+1}^*)$;

$V$    matrix associated with the $(k - L)$ eigenvectors in the
range space of $H_{l,L}$.

To show that (44) has only the trivial solution, it is sufficient
to show that the matrix formed by stacking $\Phi_{l,k}$ on $V$ has full
column rank $2k + 2$.

Treating $k$ as the estimated channel order, $\Phi_{l,k}$ has the
form of (60), shown at the top of the page. We note that
$V_{2l+2}(z_1^*, \ldots, z_{l+L+1}^*)$ has full column rank. Consequently,
the first $k + L + 1$ columns of $\Phi_{l,k}$ are linearly independent.
Because $\{z_i\}$ are distinct roots of $q(x)$, all other columns of
$\Phi_{l,k}$ are linearly dependent on the first $k + L + 1$ columns, i.e.,

$$\mathrm{rank}\{\Phi_{l,k}\} = k + 1 + L. \tag{61}$$

Next, when $k > L$, $k - L$ vectors from the range space of $H_{l,L}$
are used to form $Q$, where each vector introduces $l - k + 1$
rows in the $(k - L)(l - k + 1) \times 2(k + 1)$ matrix $V$. For
generic channels, these vectors are linearly independent among
themselves and linearly independent of the rows in $\Phi_{l,k}$, i.e.,

$$\mathrm{rank}\left\{ \begin{pmatrix} \Phi_{l,k} \\ V \end{pmatrix} \right\} = \min\{2(k + 1),\ k + 1 + L + (k - L)(l - k + 1)\} = 2k + 2. \tag{62}$$

Therefore, $\mathrm{rank}\{\mathcal{T}_k(Q)\} = 2k + 2$; hence, (44) has only the
trivial solution.

Case III ($k < L$): In this case, again treating $k$ as an estimated
channel order, $l + k + 1$ null space vectors will be
used in forming $Q$. Since $V_{2l+2}(z_1^*, \ldots, z_{l+L+1}^*)$ forms the
orthogonal complement of the range space of $H_{l,L}$, we have

$$Q = G\,V_{2l+2}^{H}(z_1^*, z_2^*, \ldots, z_{l+L+1}^*) \tag{63}$$

where $G$ is an $(l + k + 1) \times (l + L + 1)$ matrix with full
row rank. Since the singular vectors used to form $Q$ are
associated with the repeated zero singular value, $G$ can be
considered to be a randomly generated $(l + k + 1) \times (l + L + 1)$
matrix. Further, we have (64), shown at the top of the page.
Because $k < L$, $\mathrm{rank}\{\Phi_{l,k}\} = 2k + 2$. For a randomly generated
$(l + k + 1) \times (l + L + 1)$ matrix $G$,

$$\mathrm{rank}\{[Q_0, \ldots, Q_k]\} = 2k + 2. \tag{65}$$

Therefore, (44) has only the trivial solution. □
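The Vandermonde argument used in the proof can be checked numerically. For a root $z_i$ of $q(x)$, the inner product of $[1, z_i, \ldots, z_i^{2l+1}]$ with the $j$th column of $H_{l,L}$ equals $z_i^j q(z_i) = 0$, so the conjugated Vandermonde vectors are orthogonal to the column space of $H_{l,L}$. The sketch below verifies this for a small example; the subchannel coefficients, the value of $l$, and the helper name `toeplitz_filter` are hypothetical and chosen only for illustration.

```python
import numpy as np

def toeplitz_filter(c, l):
    """(l+1) x (l-L+1) Toeplitz matrix whose j-th column is c shifted
    down by j rows, following the structure of (54)."""
    L = len(c) - 1
    H = np.zeros((l + 1, l - L + 1), dtype=complex)
    for j in range(l - L + 1):
        H[j:j + L + 1, j] = c
    return H

# Hypothetical subchannels of order L = 2 and smoothing window l = 4.
f = np.array([1 + 0.5j, 0.5, 0.25 - 0.3j])
g = np.array([0.3, -0.4 + 0.2j, 1.0])
L, l = len(f) - 1, 4
H = np.vstack([toeplitz_filter(f, l), toeplitz_filter(g, l)])   # (2l+2) x (l-L+1)

# q(x) = f(x) + x^(l+1) g(x), stored in ascending powers of x as in (56).
q = np.zeros(l + L + 2, dtype=complex)
q[:L + 1] = f
q[l + 1:] = g
roots = np.roots(q[::-1])                 # np.roots expects descending powers

# Row i of Vt is [1, z_i, ..., z_i^(2l+1)].
Vt = np.vander(roots, 2 * l + 2, increasing=True)
residual = np.abs(Vt @ H).max()           # each entry equals z_i^j * q(z_i)
print(len(roots), residual)
```

The dimension count agrees with the proof: $H_{l,L}$ has $2l + 2$ rows and $l - L + 1$ columns, which leaves an orthogonal complement of dimension $l + L + 1$, the number of roots of $q(x)$.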
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